1.   Introduction

Current  databases  of  analysts'  forecasts  of  corporate  earnings  include 
predictions  from  thousands  of  individuals  employed  at  hundreds  of  financial 
service  institutions.   The  purpose  of  this  paper  is  to  analyze  whether  it  is 
possible  to  distinguish  forecasters  with  superior  ability  on  the  basis  of  ex 
post  forecast  accuracy  from  panel  data. 

Financial  press  coverage  suggests  that  there  are  superior  financial 
analysts.   An  example  of  this  coverage  is  the  Institutional  Investor  "All 
American  Research  Team."  This  ranking  is  based  on  surveys  of  money  managers, 
who  nominate  and  evaluate  analysts  on  a  variety  of  criteria,  including  earnings 
forecasting,  ability  to  pick  stocks,  and  the  quality  of  written  financial 
analysis  reports.   Clearly  services  other  than  forecast  accuracy  are  provided 
by  financial  analysts  and  valued  by  their  clients.   The  reasons  for  focusing  on 
only  one  activity,  earnings  forecasting,  are  twofold.   First,  it  is  possible  to 
evaluate forecast accuracy objectively. Second, academic use of analysts'
forecasts as earnings expectations data in capital markets empirical research
is now widespread, although some properties of the data remain to be explored.

The primary use of analysts' earnings forecasts in academic work is to
provide a proxy for the "market" expectation of a future earnings realization.
It is common to use aggregations of forecasts such as the mean or median for
this purpose. In previous work, I have demonstrated that the most current
forecast available is at least as good a proxy for the consensus as either of
these simple aggregations. The two different approaches to consensus, simple
aggregations of all available forecasts on the one hand versus the single most
current on the other, represent extreme assumptions about the underlying
individual forecast data. If dispersion in forecasts for a given firm and year
is primarily attributable to individual idiosyncratic error, then combining
forecasts should improve accuracy by "diversifying" across these idiosyncratic
components. Alternatively, since the set of analysts' forecasts available at
any one time may range from very current to many months old, it may be that
dispersion in forecasts is primarily attributable to differences in forecast
ages, and therefore to differences in the information impounded in them. In
the extreme, if each analyst aggregates the information in previous forecasts
and updates to reflect current information without adding noise to the process,
then the most current forecast may be the appropriate aggregation.

Some recent examples are: Patell and Wolfson (1984), Ricks and Hughes
(1985), Bamber (1986), Hoskin, Hughes and Ricks (1986), and Pound (1987).

O'Brien (1988a).

Under  either  of  the  above  two  scenarios,  analysts  are  implicitly  assumed  to 
be of approximately the same forecasting ability. If instead some forecasters
are  known  to  have  consistently  superior  (or  consistently  inferior)  forecasting 
ability,  then  use  of  this  knowledge  can  improve  the  accuracy  of  the  consensus. 
In  the  first  scenario,  where  differences  in  forecasts  reflect  idiosyncratic 
errors,  in  circumstances  where  the  loss  function  is  quadratic,  precision- 
weighting  provides  optimal  forecasts  if  analysts  differ  in  ability.    In  the 
second  scenario,  where  differences  in  forecasts  reflect  differences  in  the 
information  set  available  at  various  times  in  the  past,  there  will  be  a 
tradeoff  between  the  age  of  the  forecast  and  the  ability  of  the  forecaster.   In 
either  of  these  cases,  the  results  of  this  research  on  the  forecast  accuracy  of 
individuals  are  relevant. 
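
The first scenario can be illustrated with a small simulation. The sketch below is not part of the study: the true EPS value, the three error standard deviations, and the function names are hypothetical, and the analysts' error variances are assumed known, which is exactly what the remainder of the paper calls into question. Under those assumptions, weighting each forecast by the inverse of its error variance yields a lower mean squared error than the simple mean.

```python
import random
import statistics


def precision_weighted(forecasts, error_variances):
    """Combine forecasts with weights proportional to 1 / error variance."""
    weights = [1.0 / v for v in error_variances]
    total = sum(weights)
    return sum(w * f for w, f in zip(weights, forecasts)) / total


def simulate(n_trials=20000, seed=0):
    """Compare squared error of the simple mean and the precision-weighted mean."""
    rng = random.Random(seed)
    true_eps = 2.00                  # hypothetical realized EPS
    sigmas = [0.10, 0.20, 0.40]      # hypothetical analyst error std. deviations
    variances = [s * s for s in sigmas]
    sq_err_mean, sq_err_weighted = [], []
    for _ in range(n_trials):
        # Each forecast is the truth plus independent idiosyncratic noise.
        forecasts = [true_eps + rng.gauss(0.0, s) for s in sigmas]
        simple = statistics.fmean(forecasts)
        weighted = precision_weighted(forecasts, variances)
        sq_err_mean.append((simple - true_eps) ** 2)
        sq_err_weighted.append((weighted - true_eps) ** 2)
    return statistics.fmean(sq_err_mean), statistics.fmean(sq_err_weighted)


mse_mean, mse_weighted = simulate()
print(f"simple mean MSE: {mse_mean:.4f}, precision-weighted MSE: {mse_weighted:.4f}")
```

If all three standard deviations are set equal, the two combinations coincide, which corresponds to the case of analysts with approximately the same ability.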


See,  for  example,  Newbold  and  Granger  (1974),  Ashton  and  Ashton  (1985), 
Agnew  (1985),  and  Clemen  and  Winkler  (1986).   For  a  discussion  of  the 
difficulties encountered when precisions are not stable, see Kang (1986).

Overall, the results of this study support the contention that analysts do
not differ systematically in forecast accuracy. I examine a sample of
forecasts  for  firms  in  nine  different  2-digit  SIC  industries  over  the  period 
1975  to  1981.   The  analysis  proceeds  along  two  lines.   First,  a  fixed  effects 
model  of  the  forecast  accuracy  is  estimated.   The  purpose  of  this  model  is  to 
test  whether,  conditional  on  the  firms  and  years  in  the  sample,  there  is 
heterogeneity  among  analysts  in  forecast  accuracy.   Since  individual  analysts 
do  not  predict  earnings  for  all  firms  in  an  industry,  and  typically  individuals 
appear in the database for less than the full seven years, it is important to
control  for  firm-specific  and  year-specific  differences  in  predictability.   In 
each  of  the  nine  industries,  the  fixed  effects  model  fails  to  reject  the 
hypothesis  that  analysts  are  homogeneous,  conditional  on  the  firms  and  years 
they  forecast. 

Second,  because  the  error  terms  in  the  fixed  effects  model  are  severely  non- 
normal,  a  non-parametric  approach  is  taken.  The  non-parametric  tests  compare, 
for  each  industry,  the  observed  distribution  of  analysts'  average  (through 
time)  ranks  with  the  distribution  which  would  be  expected  if  all  analysts  are 
alike,  and  each  year  is  an  independent  observation.   The  non-parametric  tests 
fail  to  reject  the  hypothesis  that  the  observed  distribution  is  identical  to 
the  expected  distribution,  in  eight  of  the  nine  industries.   Continuing 
research  will  investigate  the  source  of  the  differences  in  the  single  industry 
in  which  the  null  hypothesis  is  rejected. 

The remainder of the paper is organized as follows. In section 2, I
describe the sample selection process and some characteristics of the sample of
analysts' forecasts. In section 3, the fixed effects model is introduced, and
the results of parametric tests explained. In section 4, I describe the
construction of the non-parametric tests using a single industry as an example,
and present results for all nine industries. Section 5 is a discussion, with
remarks about extensions and future work.

2.   Description  of  Sample 

The  nine-industry  sample  of  forecasts  used  in  this  paper  is  selected  from  a 
database  of  individual  forecasts  from  Institutional  Brokers  Estimate  System 
(hereafter,  I/B/E/S).   The  individual  forecasts  are  the  detail  data  from  which 
monthly  summaries  are  computed.   Monthly  summary  data  are  sold  to  I/B/E/S' 
clients  and  have  been  analyzed  extensively  by  Brown,  Foster  and  Noreen  (1985). 
The  identities  of  the  analysts  and  brokerage  houses  included  in  the  database 
are  encoded  numerically.   I  do  not  have  access  to  the  names  of  individuals  or 
brokerage  houses  underlying  the  codes. 

The  rationale  for  limiting  the  study  to  a  set  of  nine  industries  is  to  allow 
closer  examination  of  within-industry  variation.   Analysts  tend  to  concentrate 
their  forecasting  activities  within  industries,  becoming  specialists  in 
particular  types  of  firms.   Popular  press  evaluations  of  analysts  reflect  this 
specialization. For example, Institutional Investor ranks analysts within
industries  in  its  annual  survey.  The  fact  that  analysts  tend  to  concentrate 
within  industries  is  evident  in  the  sample  used  here:   of  404  analysts,  only  12 
are  included  in  more  than  one  of  the  nine  two-digit  industries. 

The industries chosen for this study are the nine industries with the
largest number of forecasts available in the database. Industry affiliation is
determined by two-digit Standard Industrial Classification (SIC) codes, as
reported in the 1982 version of the COMPUSTAT tapes. Industry names and some
characteristics of the sample are reported in Table 1 and discussed below. The
sample includes service (SIC 49 - Electrical, gas and sanitary services) and
financial firms (SIC 60 - Banking, SIC 63 - Insurance) as well as a range of
manufacturing firms.

In earlier work [O'Brien (1988b)] I have found heterogeneity across 2-digit
SIC industries to explain more of the variation in forecast errors than
heterogeneity across firms, ignoring industry effects. An efficient approach
to estimating the structure described here would be to allow for simultaneous
estimation across industries, as well as across firms within industries.
However, the computer resources necessary for this estimation exceed those
available to me at reasonable cost.

The  firms  included  in  the  study  are  those  in  the  nine  industries  listed  in 
Table  1  with  December  year  ends,  with  forecasts  in  the  I/B/E/S  database  in  each 
year  from  1975  through  1981.   The  sample  is  selected  at  a  forecast  horizon  of 
120  trading  days,  or  about  six  calendar  months,  prior  to  the  announcement  of 
annual  earnings,  for  each  firm  and  year.   The  most  recent  forecast  from  each 
analyst  is  used,  where  "most  recent"  is  determined  by  examining  the  analyst's 
forecast  date.   The  analysts  included  in  the  study  are  those  with  at  least  ten 
forecasts  available  for  firms  in  the  industry,  and  with  forecasts  in  at  least 
three  of  the  seven  years  (1975  -  1981)  covered  by  the  database. 
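
The analyst-level data requirements can be sketched as a simple filter. The record layout (analyst code, firm code, year) and the function name are hypothetical; each record is assumed to be the single most recent forecast from that analyst for that firm and year at the 120-trading-day horizon, within one industry.

```python
from collections import defaultdict


def eligible_analysts(forecasts, min_forecasts=10, min_years=3):
    """Return analysts with enough forecasts and enough distinct years.

    forecasts: iterable of (analyst_id, firm_id, year) tuples for one industry.
    """
    counts = defaultdict(int)
    years = defaultdict(set)
    for analyst, _firm, year in forecasts:
        counts[analyst] += 1
        years[analyst].add(year)
    return {
        a for a in counts
        if counts[a] >= min_forecasts and len(years[a]) >= min_years
    }


# Hypothetical data: only A1 meets both requirements.
sample = (
    [("A1", f"firm{k}", y) for y in (1975, 1976, 1977) for k in range(4)]    # 12 forecasts, 3 years
    + [("A2", f"firm{k}", y) for y in (1975, 1976) for k in range(6)]        # 12 forecasts, 2 years
    + [("A3", f"firm{k}", y) for y in (1975, 1976, 1977) for k in range(2)]  # 6 forecasts, 3 years
)
kept = eligible_analysts(sample)
print(kept)
```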

The  restrictions  on  the  number  of  forecasts  from  each  analyst  are  designed 
to  ensure  that  sufficient  data  are  available  for  reliable  statistical 
inference.   It  is  interesting  to  note  in  Table  1  that,  while  the  set  of 
analysts  examined  is  still  reasonably  large,  these  requirements  eliminate  more 
than  three-quarters  of  the  analysts  available  in  I/B/E/S  for  each  industry  from 
consideration.   This  fact  may  reflect  features  of  the  market  for  financial 
analysts,  and  of  the  database.   Analysts  who  change  brokerage  houses  within  the 
sample  period  cannot  be  tracked  to  their  new  jobs,  even  if  they  remain  in  the 
database.   An  analyst  who  changes  brokerage  houses  appears  as  a  new  analyst  in 
the  data  codes.   Thus,  the  three-year  requirement  may  eliminate  analysts  with 
the greatest job-mobility from the sample. However, note that the analysts
remaining in the sample produce a disproportionately large share of the
forecasts.

Table  2  illustrates  that  there  is  considerable  variation  in  analyst 
following  across  the  nine  industries.   This  is  true  in  the  database,  and  is 
carried  over  into  the  sample  without  much  distortion.   Firms  in  the  Petroleum 
Refining  industry  (SIC  29)  are  most  closely-followed,  with  88.8  forecasts  per 
firm  (over  seven  years)  in  the  final  sample.   Firms  in  Banking  (SIC  60),  on  the 
other  hand,  have  approximately  half  this  following,  with  44.3  forecasts  per 
firm  in  the  final  sample.   The  proportions  of  the  sample  in  each  industry  are 
not  altered  dramatically  by  the  sample  selection  procedure. 

Earnings data for this study are obtained from COMPUSTAT, and are primary
earnings  per  share  (EPS)  before  extraordinary  items.   When  forecasts  are  made 
for  fully-diluted  EPS,  this  is  flagged  in  the  database.   I  convert  these  fully- 
diluted  forecasts  to  primary  using  the  ratio  of  primary  to  fully-diluted  EPS 
for  that  firm  and  year  from  COMPUSTAT.   When  a  stock  split  or  dividend  is 
announced  after  a  forecast  was  made,  I  adjust  the  share  basis  of  the  EPS 
forecast  using  distribution  data  from  the  CRSP  Master  file. 
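
The two adjustments described above amount to simple rescalings. The following is a minimal sketch with hypothetical function names and figures; in particular, expressing the stock distribution as a single split factor (new shares per old share) is an assumption about how the distribution data would be used, not a description of the CRSP file format.

```python
def to_primary_basis(fd_forecast, primary_eps, fully_diluted_eps):
    """Convert a fully-diluted EPS forecast to a primary basis using the
    ratio of actual primary to actual fully-diluted EPS for the firm-year."""
    if fully_diluted_eps == 0:
        raise ValueError("fully-diluted EPS is zero; ratio is undefined")
    return fd_forecast * (primary_eps / fully_diluted_eps)


def adjust_for_split(forecast, split_factor):
    """Restate a pre-split EPS forecast on the post-split share basis.

    split_factor: new shares per old share (2.0 for a 2-for-1 split);
    this parameterization is an assumption made for illustration.
    """
    return forecast / split_factor


# Hypothetical firm-year: actual primary EPS 3.30, fully-diluted EPS 3.00.
primary_forecast = to_primary_basis(3.00, 3.30, 3.00)
# A 4.00 forecast made before a 2-for-1 split becomes 2.00 per new share.
post_split_forecast = adjust_for_split(4.00, 2.0)
```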

3.   Parametric  Tests  of  Forecast  Accuracy 

The forecast accuracy metric used to compare analysts is average absolute
forecast error, where absolute forecast error is defined:

\[ |e_{ijt}| = |A_{jt} - F_{ijt}| . \tag{1} \]

In (1), $A_{jt}$ denotes actual EPS for firm j in year t, and $F_{ijt}$ denotes the
forecast of $A_{jt}$ from analyst i at a horizon of 120 trading days prior to the
annual earnings announcement.

Average squared forecast error is another commonly-used accuracy
criterion. However, in these data, use of squared forecast errors results in
extremely skewed and fat-tailed distributions, amplifying problems of non-
normality that affect statistical inferences. This point is discussed in
greater detail below.

Analyst  forecast  data  are  extremely  unbalanced.   The  term  "unbalanced" 
refers  to  the  fact  that  there  is  not,  in  general,  a  forecast  from  each  analyst 
for  each  firm  in  each  year  in  the  sample.   This  pervasive  feature  of  the 
population  of  analyst  forecasts  is  carried  over  into  the  sample,  even  after 
imposing  the  data  sufficiency  requirements  described  above.   To  address  the 
lack  of  balance  in  the  sample,  a  fixed  effects  model  is  used  to  estimate 
average  accuracy.   The  model  is: 


\[ |e_{ijt}| = \mu_i + \delta_j + \gamma_t + \eta_{ijt} , \tag{2} \]

where $\mu_i$, $\delta_j$ and $\gamma_t$ are analyst-, firm- and year-effects,
respectively. The estimates of $\mu_i$ are interpreted as the average accuracy
for analyst i, conditional on the firms and years in the portfolio of
predictions from analyst i in the sample. Analysts may exercise some
discretion over the firms and years for which they issue forecasts. This
endogenous selection is important for statistical inferences about the relative
accuracy of analysts as long as firms and years are not homogeneous in average
forecast accuracy. The estimated firm- and year-effects remove the average
effects of differences in accuracy across years and across firms.

Unbalanced data and its implications for sample selection are discussed
in greater detail in O'Brien (1988a) and (1988b).

It is common, in modelling analysts' forecast errors, to scale the errors
by a firm-specific denominator to control for heterogeneity in predictability
across firms. Since equation (2) is designed explicitly to control for this
heterogeneity, it is less important here to do so by transformation of the
forecast errors. Moreover, such transformations may not result in homogeneity
across firms. For example, when equation (2) was estimated with the dependent
variable defined as the absolute value of (forecast error divided by the
previous year's actual EPS), the hypothesis of homogeneity across firms was
rejected at the .001 level or better in all nine industries. Similar results
were obtained when the dependent variable was defined as absolute forecast
errors, scaled by a five-year average of the absolute value of EPS changes. In
all cases, the major conclusion of this paper is unaltered: analysts are
indistinguishable in average accuracy conditional on firm and year effects.

Differences  in  average  accuracy  across  years  are  expected  if  the 
unanticipated  events  occurring  in  the  last  half  (approximately)  of  the  year 
have  a  larger  magnitude  effect  on  corporate  earnings  in  some  years  than  in 
others.   Differences  in  average  accuracy  across  firms  are  expected  if  the 
earnings  of  some  firms  are  harder  to  predict  than  those  of  others.   Differences 
in  difficulty  of  prediction  across  firms  could  arise  because  of,  e.g., 
differences  in  earnings  volatility,  differences  in  company  disclosure  policies 
regarding  interim  and  non-earnings  information,  and  differences  in  the 
sensitivity  of  earnings  to  other  observable  data,  such  as  input  prices. 

Evidence  of  the  importance  of  firm  and  year  effects  can  be  seen  in  Table  3, 
where  the  results  of  estimating  equation  (2)  for  each  of  the  nine  industries  in 
the  sample  are  displayed.   In  all  nine  industries,  firm  effects  and  year 
effects  explain  significant  amounts  of  the  variation  in  forecast  accuracy.   The 
nominal  significance  levels  on  all  tests  are  smaller  than  .0001,  with  the 
exception  of  the  year  effects  in  industry  28,  Chemicals  and  allied  products. 
In this case, year effects contributed significant explanatory power at the .06
level.

In contrast to the importance of the year and firm effects in Table 3,
analyst effects do not contribute significant explanatory power to the model.
Equivalently, the hypothesis that analysts are homogeneous in average forecast
accuracy, conditional on firm and year effects, cannot be rejected for any
industry in the sample. Attempts to distinguish superior forecasters which do
not take into account differences in the portfolios of firms or the years for
which the analysts make predictions may lead to faulty inferences. The
magnitude of this possibility is studied further in Table 4, by comparing the
results from estimating equation (2) with the results from the following model:

\[ |e_{ijt}| = \mu_i + \nu_{ijt} . \tag{3} \]

In (3), fixed effects are estimated only for analysts, and firm and year
effects are ignored. The coefficient $\mu_i$ is the average absolute forecast
error, across firms and years, for analyst i. Implicit in this measure is the
assumption that forecast accuracy is homogeneous for each analyst across firms
and years. Measured differences in forecast accuracy across analysts in Panel A
of Table 4 are significant at the 10% level or better in six of the nine
industries.

A  more  detailed  comparison  of  estimates  from  model  (2),  where  absolute 
accuracy  is  measured  conditional  on  firm  and  year  effects,  and  model  (3),  where 
firm  and  year  effects  are  ignored,  is  carried  out  in  Panel  B  of  Table  4.   The 
reported  numbers  include  the  range  of  estimated  values  measuring  analyst 
accuracy,  the  average  across  analysts  of  the  standard  errors  on  these 
estimates,  and  the  residual  standard  errors  from  each  set  of  regressions. 
Naturally,  the  three-factor  model,  equation  (2),  has  a  smaller  residual 
standard  error  than  the  one-factor  model,  equation  (3).   However,  the  average 
standard  error  on  analysts'  average  accuracy  is  not  much  different  between  the 
two  models,  and  in  all  industries  the  range  of  average  accuracies  is  larger  in 
the  one-factor  model. 

The average standard error reported in Table 4 for equation (2)
understates the average of standard errors which would be used for comparisons
between and among analysts. This is true because the standard error of the
difference between two analysts depends on the standard errors of the estimates
for each analyst, and the covariance between them. In model (2), the regressor
sums of squares and crossproducts matrix (usually denoted X'X) is simply a
matrix of counts: each element of the matrix equals the number of observations
corresponding  to  that  regressor  pair.   The  off-diagonal  elements  of  the  inverse 
of  this  matrix  are,  in  general,  non-positive,  so  they  increase  the  standard 
error  of  differences  between  coefficients.   In  contrast,  in  model  (3), 
covariances  between  estimates  for  different  analysts  are  assumed  uniformly  to 
be  zero. 
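
The "matrix of counts" property can be checked on a toy panel. The sketch below uses five hypothetical observations and one 0/1 dummy column per analyst, firm and year; it only illustrates the structure of X'X. With a full set of dummies for all three factors the matrix is singular, so actual estimation would require dropping a level or imposing a restriction, a step not shown here.

```python
# Hypothetical unbalanced panel: (analyst, firm, year) for each forecast.
obs = [
    ("a1", "f1", 1975),
    ("a1", "f2", 1975),
    ("a2", "f1", 1975),
    ("a2", "f1", 1976),
    ("a1", "f1", 1976),
]

# One dummy regressor per analyst, firm and year level.
labels = [
    ("analyst", "a1"), ("analyst", "a2"),
    ("firm", "f1"), ("firm", "f2"),
    ("year", 1975), ("year", 1976),
]


def dummy_row(observation):
    """0/1 indicators marking the analyst, firm and year of one observation."""
    analyst, firm, year = observation
    present = {("analyst", analyst), ("firm", firm), ("year", year)}
    return [1 if label in present else 0 for label in labels]


X = [dummy_row(o) for o in obs]
k = len(labels)

# X'X: element (i, j) counts observations where regressors i and j are both 1.
xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]

for label, row in zip(labels, xtx):
    print(label, row)
```

In this example the diagonal entry for analyst a1 is 3 (three forecasts), the entry pairing a1 with firm f1 is 2, and the entry pairing a1 with a2 is 0, since no observation belongs to two analysts.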

The  comparison  of  results  in  Tables  3  and  4  demonstrates  that  firm  and  year 
effects  affect  inferences  regarding  differences  in  average  accuracy  across 
analysts.   If  firm  and  year  effects  are  considered,  then  analysts  appear 
homogeneous  in  forecast  accuracy.   If  firm  and  year  effects  are  ignored,  then 
analysts  appear  to  have  heterogeneous  forecasting  ability  in  six  of  the  nine 
industries.   In  the  latter  case,  estimates  of  analysts'  average  accuracies 
cover  a  broader  range  than  in  the  former.   In  addition,  in  the  latter  case  the 
standard  errors  on  differences  between  analysts  are  generally  smaller,  because 
covariances  are  ignored. 

The  tests  used  in  the  above  regression  analysis  of  forecast  errors  depend  on 
the  assumption  that  the  error  terms  are  normally  distributed.   The  residuals 
from  regression  equation  (2),  however,  exhibit  severe  non-normality.  This  is 
evident  in  Table  5,  which  displays  skewness  and  kurtosis  coefficients  and  a 
test  for  normality  on  the  regression  residuals.   The  residuals  from  all 
regressions  are  highly  right-skewed,  and  highly  leptokurtic.   In  all 
industries, the null hypothesis that the residuals come from a Normal
distribution can be rejected at levels smaller than .01. The extremely fat
tails of the residual distributions indicate too-frequent rejection.
Therefore, the result that analysts are not statistically distinguishable in
forecast accuracy would not be altered. Extreme skewness, on the other hand,
has ambiguous implications for inferences. Because the Normal distribution
which underlies the parametric tests of this section does not appear to
describe analysts' forecast data, non-parametric tests are conducted in the
next section.

Both the log and the square root transformation were used in attempts to
obtain residual distributions closer to the normal. Both these transformations
are monotonic, and so would preserve the ordering of accuracies. However,
neither transformation resulted in normally-distributed residuals. In both
cases, the distributions of residuals were left-skewed and leptokurtic.

4.  Non-parametric  Tests 

A  major  difficulty  in  devising  non-parametric  tests  for  analyst  forecast 
data  is  the  unbalanced  nature  of  the  data.   Specifically,  two  problems  arise. 
First,  there  is  evidence  above  that  firms  and  years  differ  in  predictability. 
Therefore,  it  is  not  obvious  how  to  rank-order  the  accuracy  of  two  forecasts 
if,  e.g.,  they  come  from  different  years.   Second,  firms  have  different  numbers 
of  analysts  forecasting  their  EPS,  so  aggregating  ranks  across  firms  for  a 
given  analyst  can  create  uninterpretable  results.   For  example,  the  rank  "third 
most  accurate"  should  be  interpreted  differently  if  there  are  twenty  analysts 
being  ranked  than  if  there  are  four.   I  have  attempted  to  address  both  these 
concerns  in  the  test  devised  below. 

The  non-parametric  test  is  based  on  the  Kolmogorov-Smirnov  statistic,  and 
indicates  whether  the  distribution  of  analysts'  ranks  is  the  same  as  the 
distribution  that  would  be  expected  if  all  ranks  were  equally  probable  for  each 
analyst  in  each  year.   For  expository  reasons,  I  describe  the  construction  of 
the  expected  distribution  in  detail  using  a  single  industry  (SIC  26,  Paper  and 
allied  products)  as  a  numerical  example,  and  then  discuss  the  results  of  the 
test  for  the  entire  nine-industry  sample. 
There are 29 analysts in the Paper industry sample. The first three columns
of  Table  6  display  the  identification  code,  the  number  of  years  in  the  sample, 
and  the  average  rank  for  each  of  these  29  analysts.   The  average  rank  is 
computed  as  follows.   For  each  year,  the  mean  (across  firms)  absolute  forecast 
error  is  computed  for  each  analyst  forecasting  EPS  in  the  Paper  industry. 
These  mean  absolute  forecast  errors  are  then  ranked  in  quartiles  within  years. 
Finally,  an  average  rank  is  computed  for  each  analyst,  over  all  years  in  which 
the  analyst  is  in  the  sample. 
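
The three steps of the average-rank computation can be sketched as follows. The input layout and function names are hypothetical, and two details are assumptions not stated in the text: quartile 1 is taken to denote the smallest (most accurate) mean absolute error, and quartile boundaries and ties are resolved by position in the sorted order.

```python
from collections import defaultdict
from statistics import fmean


def quartile_ranks(values):
    """Map each key to a quartile 1..4 by position in ascending sorted order."""
    ordered = sorted(values, key=values.get)
    n = len(ordered)
    return {key: (4 * pos) // n + 1 for pos, key in enumerate(ordered)}


def average_ranks(errors):
    """errors: iterable of (analyst, firm, year, abs_forecast_error)."""
    by_year = defaultdict(lambda: defaultdict(list))
    for analyst, _firm, year, abs_err in errors:
        by_year[year][analyst].append(abs_err)
    ranks = defaultdict(list)
    for per_analyst in by_year.values():
        # Step 1: mean absolute error across firms, per analyst, within the year.
        mean_err = {a: fmean(v) for a, v in per_analyst.items()}
        # Step 2: rank those means in quartiles within the year.
        for analyst, quartile in quartile_ranks(mean_err).items():
            ranks[analyst].append(quartile)
    # Step 3: average the quartile ranks over the years the analyst appears.
    return {a: fmean(q) for a, q in ranks.items()}


# Hypothetical data for four analysts over two years.
errors = [
    ("A", "f1", 1975, 0.1), ("A", "f2", 1975, 0.3),
    ("B", "f1", 1975, 0.5), ("C", "f1", 1975, 0.9), ("D", "f1", 1975, 1.5),
    ("A", "f1", 1976, 0.8), ("B", "f1", 1976, 0.2),
    ("C", "f1", 1976, 0.4), ("D", "f1", 1976, 1.0),
]
avg = average_ranks(errors)
print(avg)
```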

Ranking within years addresses the first of the two concerns mentioned
above, that EPS differ in predictability across years. Ranking within years
controls for year-specific information effects. Quartiles, rather than the
complete rank order, are used to address the second concern, that different
numbers of analysts predict EPS in each year. Since there are more than four
analysts in each industry in each year, quartiles do not present the difficulty
of uninterpretable aggregation. A drawback of the use of quartiles is that
information on analysts' relative positions within quartiles is lost.

Ranking within years does not control for the fact that firms, as well as
years, differ in predictability. While ranking within both firms and years is
feasible in principle, issues of aggregation and of independence become more
important in this alternative scheme. I return to these points later in this
section.

The null hypothesis for the non-parametric tests is based on the assumption
that each analyst has probability .25 of falling in each of the four quartiles
in each year, and that each year is an independent observation. An analyst
with forecasts in three different years can have ten possible average ranks,
ranging from 1.0 to 4.0 as displayed in the first column of Table 7. More
generally, for analysts with forecasts in T years there are 3T + 1 possible
average ranks, or outcomes. There are $4^T$ possible sequences, or paths, of
length T leading to these outcomes. The probability of any particular outcome
under the null hypothesis is simply the proportion of the $4^T$ possible paths
which result in that outcome. The probabilities associated with the ten
possible average ranks for an analyst with forecasts in three years are given
in the second column of Table 7. The expected frequencies in the third column
of Table 7 are computed by multiplying the probabilities by 10, the number of
analysts in the Paper industry with forecasts in three different years. This
calculation is repeated for the possible outcomes and numbers of analysts with
forecasts in four, five, six and seven years. Finally, the expected
frequencies are aggregated into a single distribution of expected average
ranks, conditional on the numbers of analysts in the sample. In the Paper
industry sample, the numbers of analysts with three, four, five, six and seven
years' forecasts are 10, 10, 5, 1 and 3, respectively. The distribution of
expected frequencies is converted to a density by dividing each expected
frequency by 29, the number of analysts in the sample for the Paper industry.
The density and cumulative density under the null hypothesis are given in the
third and fourth columns of Table 8.
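
The construction of the expected distribution can be reproduced by direct enumeration. The sketch below is an illustration rather than the original computation: function names are hypothetical, and exact fractions are used so that probabilities can be compared without rounding. It enumerates all quartile paths of length T, tabulates the probability of each average rank, and mixes the results using the Paper industry counts quoted above.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def null_distribution(T):
    """P(average rank) over T independent years, each quartile with prob. 1/4."""
    # Count the equally likely quartile paths that lead to each rank sum.
    path_counts = Counter(sum(path) for path in product((1, 2, 3, 4), repeat=T))
    total_paths = 4 ** T
    return {
        Fraction(rank_sum, T): Fraction(count, total_paths)
        for rank_sum, count in sorted(path_counts.items())
    }


def expected_density(analysts_by_years):
    """Mix the per-T distributions, weighting by the number of analysts with T years."""
    n_analysts = sum(analysts_by_years.values())
    density = Counter()
    for T, count in analysts_by_years.items():
        for avg_rank, prob in null_distribution(T).items():
            density[avg_rank] += prob * count / n_analysts
    return dict(sorted(density.items()))


# Paper industry: analysts with 3, 4, 5, 6 and 7 years of forecasts.
paper = {3: 10, 4: 10, 5: 5, 6: 1, 7: 3}
three_years = null_distribution(3)
density = expected_density(paper)

# Expected frequency of an average rank of 1.0 among the ten 3-year analysts.
expected_freq_rank_one = 10 * three_years[Fraction(1)]
```

For T = 3 the enumeration yields ten outcomes, and an average rank of 1.0 arises from exactly one of the 64 paths.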

The  test  statistic  is  constructed  by  comparing  the  empirical  density,  which 
assigns  a  probability  mass  of  (1/29)  or  0.03448  to  each  sample  observation,  to 
the  expected  density  under  the  null  hypothesis,  constructed  as  described  above. 
The  test  statistic  is: 

\[ KS = \left[ \frac{MN}{M+N} \right]^{1/2} D^{*}_{MN} , \tag{4} \]

where $D^{*}_{MN}$ is defined as the maximum distance, over all sample points,
between the empirical density and the expected density. M denotes the number
of possible outcomes in the expected density. For the problem outlined here,
where analysts can have forecasts in three to seven years, M equals 55.

See, for example, DeGroot (1986) pp. 552-9 for a description of
Kolmogorov-Smirnov tests.
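
The value of M can be verified by counting the distinct average ranks attainable with three to seven years of forecasts. The sketch below also evaluates the test statistic, reading equation (4) as the square root of MN/(M+N) times the maximum distance. Two inputs are assumptions: N is taken to be 29, the number of analysts in the Paper industry sample, and the maximum distance is taken as the rounded value .08. Function names are hypothetical.

```python
from fractions import Fraction
from math import sqrt


def possible_average_ranks(year_counts):
    """Distinct average quartile ranks attainable for any T in year_counts."""
    return {
        Fraction(rank_sum, T)
        for T in year_counts
        for rank_sum in range(T, 4 * T + 1)   # rank sums run from T to 4T
    }


def ks_statistic(m, n, d_max):
    """sqrt(MN / (M + N)) times the maximum distance between the densities."""
    return sqrt(m * n / (m + n)) * d_max


M = len(possible_average_ranks(range(3, 8)))   # T = 3, ..., 7
ks = ks_statistic(M, 29, 0.08)                 # N = 29 and D = .08 are assumed inputs
print(M, round(ks, 3))
```

With the rounded distance of .08 this sketch gives a statistic of roughly .35 rather than the .37 used in the text; the gap would be consistent with an unrounded maximum distance slightly above .08, although the text does not report that figure.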

The  differences  between  the  empirical  and  expected  densities  at  each  sample 
point  are  given  in  the  last  column  of  Table  6.   The  maximum  difference  is  .08, 
which  leads  to  a  test  statistic  of  .37.   From  the  limiting  distribution  for  the 
KS  test  statistic,  reproduced  as  Panel  B  of  Table  9,  the  probability  of 
observing  a  value  of  .37  or  greater  is  more  than  .99.   Therefore,  the  observed 
distribution  of  average  ranks  for  analysts  in  the  Paper  industry  does  not 
differ  from  the  distribution  which  would  be  expected  if  each  analyst  had  a 
probability  .25  of  falling  in  each  quartile  in  each  year. 

In  Panel  A  of  Table  9,  the  results  of  the  Kolmogorov-Smirnov  test  for  all 
nine  industries  in  the  sample  are  reported.   In  eight  of  the  nine  industries, 
the  probability  of  obtaining  a  KS  value  at  least  as  high  as  the  one  reported  is 
approximately  .25  or  larger.   That  is,  for  eight  of  the  nine  industries,  the 
distribution of analysts' average ranks is indistinguishable from the distribution
that  would  be  expected  if  all  analysts  were  identical,  with  independent  equal 
chances  of  being  ranked  in  each  of  the  four  quartiles  in  each  year. 

As  was  mentioned  above,  the  construction  of  the  non-parametric  test  does  not 
control  for  systematic  differences  across  firms  in  predictability.   Based  on 
the  results  reported  in  Table  3,  these  differences  appear  to  be  important  in 
most  industries.   If  differences  in  accuracy  among  analysts  are  in  part 
attributable  to  their  selections  of  firms  with  different  levels  of 
predictability,  then  a  test  which  ignores  firm  effects  may  bias  results  toward 
finding  differences  among  analysts.   For  example,  if  some  analysts 
systematically  forecast  EPS  for  a  portfolio  of  firms  with  relatively 
predictable  earnings,  and  others  systematically  forecast  firms  with  relatively
hard-to-predict  earnings,  the  former  group  will  have  a  higher  probability  of
being  in  the  first  quartile  in  each  year  than  the  latter  group.

Ignoring  systematic  differences  across  firms  in  predictability  is  not  a 
concern  in  the  eight  industries  in  which  no  difference  was  found,  since 
conditioning  on  firm  effects  would  decrease  the  likelihood  that  significant 
differences  among  analysts  will  be  found.   However,  it  is  a  potential  concern 
in  the  Chemicals  Industry  (SIC  28),  where  the  distribution  of  analysts  is 
significantly  different  from  the  expected  distribution  under  the  null 
hypothesis.   In  this  case,  the  question  remains  whether  the  differences  are 
attributable  to  firm  effects  which  have  been  ignored  in  the  non-parametric 
tests. 

Evidence  on  the  relative  importance  of  firm  effects  in  the  Chemical  Industry 
is  available  in  Table  3,  where  the  results  of  the  parametric  tests  in  the  fixed 
effects  model  are  reported.  While  all  nine  industries  show  statistically 
significant  firm  effects  and  year  effects,  the  relative  strength  of  firm 
effects  and  relative  weakness  of  year  effects  is  greatest  for  SIC  28,  the 
Chemical  Industry.  This  suggests  that  the  observed  differences  in  accuracy 
across  analysts  obtained  in  non-parametric  tests  for  the  Chemical  Industry  may 
be  attributable  to  firm  effects.   I  am  currently  investigating  this 
possibility. 

One  approach  to  the  examination  of  whether  firm  effects  drive  the  results  in 
the  Chemical  Industry  is  simply  to  reverse  the  roles  of  firms  and  years  in 
constructing  the  expected  distribution.  That  is,  rank  analysts  on  the  basis  of 
average  accuracy  across  years,  for  each  firm.  Then  treat  each  firm  as  an 
independent  observation,  and  presume  that  all  analysts  are  alike,  and  have 
equal  probabilities  of  falling  in  each  fractile  with  each  firm.   The 
justification  for  this  approach  in  the  case  of  the  Chemical  Industry  is  the
relative  strength  of  firm  effects  and  relative  weakness  of  year  effects
observed  in  estimating  the  fixed  effects  model.
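
The reversed ranking could be sketched as follows. This is only an illustration under stated assumptions: the input layout, the function names, and the rule for assigning fractiles (the ceiling of the number of fractiles times the analyst's position, divided by the number of analysts following the firm) are hypothetical and not taken from the paper, and the error values are invented.

```python
import math
from collections import defaultdict

def fractile_ranks_by_firm(avg_error, n_fractiles=4):
    """Rank analysts within each firm on average accuracy across years.

    avg_error maps (analyst, firm) -> average absolute forecast error
    across years (hypothetical layout). Returns analyst -> list of
    fractile ranks, one per firm followed, with 1 the most accurate.
    """
    by_firm = defaultdict(list)
    for (analyst, firm), err in avg_error.items():
        by_firm[firm].append((err, analyst))
    ranks = defaultdict(list)
    for firm, rows in by_firm.items():
        rows.sort()                      # smallest error first
        n = len(rows)
        for pos, (_, analyst) in enumerate(rows, start=1):
            ranks[analyst].append(math.ceil(n_fractiles * pos / n))
    return dict(ranks)

def average_rank(ranks):
    """Average each analyst's fractile ranks over the firms followed."""
    return {a: sum(r) / len(r) for a, r in ranks.items()}

# Invented example: four analysts, two firms.
example = {("A", "X"): 0.1, ("B", "X"): 0.2, ("C", "X"): 0.3, ("D", "X"): 0.4,
           ("A", "Y"): 0.5, ("B", "Y"): 0.1, ("C", "Y"): 0.2, ("D", "Y"): 0.9}
avg_ranks = average_rank(fractile_ranks_by_firm(example))
print(avg_ranks)
```

The resulting average ranks would then be compared with the distribution expected when every fractile is equally likely for each firm, in the same way as the year-based test.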


5.   Discussion  and  Summary

In  both  parametric  and  non-parametric  tests  of  individual  analysts'  accuracy 
within  industries,  the  overall  conclusion  is  that  analysts  do  not  exhibit 
consistent  differences  in  forecasting  ability.   One  implication  of  this  result 
is  that  attempts  to  improve  the  accuracy  of  consensus  measures  by  weighting 
relative  to  prior  precision  are  unlikely  to  succeed.   However,  the  result  also 
raises  a  question:   why  are  so  many  analysts  engaged  in  forecasting  earnings? 

An  explanation  may  lie  in  the  difference  between  the  academic's  historical 
perspective  and  the  investor's  interest  in  future  events.   The  research 
question  addressed  by  this  study  is  primarily  motivated  by  the  typical  academic 
use  of  analysts'  forecast  data:   to  determine,  on  a  given  date,  a  proxy  for  the 
market  expectation  of  a  firm's  earnings  from  the  set  of  forecasts  available. 
That  is,  academic  use  of  forecast  data  concerns  properties  of  the  point 
estimate  of  earnings,  given  the  cross-section  available  at  any  time.   In 
contrast  to  this,  investors'  use  of  analysts'  forecasts  of  earnings  presumably 
requires  both  accuracy  of  the  point  estimate,  and  timely  incorporation  of 
important  new  information.   It  may  be  the  case  that,  once  announced  by  an 
informed  analyst,  an  updated  point  estimate  is  easy  for  other  analysts  to 
mimic,  or  to  incorporate  into  their  own  estimates.   While  the  point  estimate 
can  be  mimicked,  presumably  it  is  more  difficult  to  mimic  the  timing  of 
informed  updating. 

If  the  above  circumstances  are  descriptive  of  the  process  by  which  analysts'
forecasts  are  generated,  then,  viewed  ex  post  as  in  academic  studies,  it  is  not
particularly  surprising  that  cross-sections  of  forecasts  taken  at  an  arbitrary
time  do  not  display  consistent  differences  in  analysts'  abilities  to  make  point 
estimates.   In  the  scenario  described  above,  analysts  would  not  compete  on  the 
basis  of  the  accuracy  of  their  point  estimates,  but  rather  would  compete  on  the 
basis  of  timely  incorporation  of  new  information. 

Although  the  timing  issue  is  a  matter  of  indifference  for  the  academic 
purpose  of  defining  the  market  expectation  at  any  time  using  ex  post  data,  it 
would  be  interesting  to  learn  whether  the  above  circumstances  describe  the 
process  of  forecast  updating.   This  is  the  subject  of  continuing  research.


Table 1
Sample Selection in Nine Industries


                                                               % of
                                        Original      Final    Original
SIC  Industry Name                        Sample     Sample    in Final

26   Paper and allied        forecasts      1871        826       44.1%
     products                analysts        362         29        8.0
                             firms            17         15       88.2

28   Chemicals and allied    forecasts      5359       2900       54.1%
     products                analysts        523        104       19.9
                             firms            43         43      100.0

29   Petroleum refining and  forecasts      2958       1775       60.0%
     related industries      analysts        282         50       17.7
                             firms            21         20       95.2

33   Primary metal           forecasts      1054        524       49.7%
     industries              analysts        153         26       17.0
                             firms            12         11       91.7

35   Machinery, except       forecasts      2902       1106       38.1%
     electrical              analysts        635         63        9.9
                             firms            29         26       89.7

36   Electrical and          forecasts      2064        641       31.1%
     electronic machinery    analysts        503         41        8.2
                             firms            28         24       85.7

49   Electric, gas and       forecasts      4664       2888       61.9%
     sanitary services       analysts        262         49       18.7
                             firms            63         63      100.0

60   Banking                 forecasts      2596       1594       61.4%
                             analysts        178         41       23.0
                             firms            36         36      100.0

63   Insurance               forecasts      1076        492       45.7%
                             analysts        138         19       13.8
                             firms             9          9      100.0

1.  The  nine  industries  examined  are  2-digit  SIC  industries  for  which  the  most 
data  were  available  in  the  I/B/E/S  database. 

2.  The  original  sample  consists  of  forecasts  for  firms  with  December  year 
ends,  with  at  least  one  forecast  in  the  I/B/E/S  database  in  each  year, 
1975-1981. 

3.  The  final  sample  consists  of  forecasts  from  analysts  with  at  least  10 
forecasts  for  any  firms  in  the  industry,  and  with  forecasts  in  at  least  3  of 
the  years  1975-1981. 


Table 2
Industry Following in Nine Industries


Original Sample

            firms            forecasts       forecasts
SIC        #      %          #       %       per firm

26        17    6.6       1871     7.6        110.1
28        43   16.7       5359    21.8        124.6
29        21    8.1       2958    12.1        140.9
33        12    4.7       1054     4.3         87.8
35        29   11.2       2902    11.8        100.1
36        28   10.9       2064     8.4         73.7
49        63   24.4       4664    19.0         74.0
60        36   14.0       2596    10.6         72.1
63         9    3.5       1076     4.4        119.6

Totals   258  100.0      24544   100.0


Final Sample

            firms            forecasts       forecasts
SIC        #      %          #       %       per firm

26        15    6.1        826     6.5         55.1
28        43   17.4       2900    22.8         67.4
29        20    8.1       1775    13.9         88.8
33        11    4.5        524     4.1         47.6
35        26   10.5       1106     8.7         42.5
36        24    9.7        641     5.0         26.7
49        63   25.5       2888    22.7         45.8
60        36   14.6       1594    12.5         44.3
63         9    3.6        492     3.9         54.7

Totals   247  100.0      12746   100.0


1.  The  nine  industries  examined  are  2-digit  SIC  industries  for  which  the  most  data 
were  available  in  the  I/B/E/S  database. 

2.  Columns  labelled  "#"  report  the  number  of  firms  or  forecasts  in  the  sample  for 
each  industry.   Columns  labelled  "%"  report  the  proportion  of  the  sample  in  each 
industry. 

3.  The  original  sample  consists  of  forecasts  for  firms  with  December  year  ends,  with
earnings  data  available  on  COMPUSTAT,  with  at  least  one  forecast  in  the  I/B/E/S 
database  in  each  year,  1975-1981. 

4.  The  final  sample  consists  of  forecasts  from  analysts  with  at  least  10  forecasts 
for  any  firms  in  the  industry,  and  with  forecasts  in  at  least  3  of  the  years  1975- 
1981. 


Table 3
Estimation of Absolute Forecast Error Model with Analyst, Firm and Year
Fixed Effects, For Nine Industries


$|e_{ijt}| = \mu_i + \delta_j + \gamma_t + \eta_{ijt}$     (2)


         Analysts          Firms            Years                Model
SIC     n_i      F       n_j      F       n_t      F         N    R^2      F

26       29   0.47        15  12.45         7  13.70       826    .27   5.83
28      104   0.79        43  16.99         7   2.04      2900    .32   8.73
29       50   0.91        20  12.12         7  30.80      1775    .24   7.43
33       26   0.76        11   4.22         7   7.98       524    .21   3.11
35       63   0.87        26   3.33         7   5.26      1106    .16   2.09
36       41   1.08        24  11.93         7   8.32       641    .43   6.36
49       49   1.03        63  13.99         7   4.95      2888    .29   9.81
60       41   0.49        36   5.30         7   6.02      1594    .14   3.15
63       19   0.83         9  21.34         7   6.70       492    .36   7.92

1. The estimated equation is equation (2) in the text. The dependent
variable $|e_{ijt}|$ is the absolute value of the forecast error from analyst i
on firm j's earnings for year t, where the forecast is taken at a horizon of
120 trading days prior to the annual earnings announcement. The equation is
estimated separately for each of the nine industries.

2. $n_i$ denotes the number of analysts in the sample in each industry. The
reported F statistics test the hypothesis that, in each industry, analysts
are homogeneous in average forecast accuracy, conditional on the firms and
years in the sample. The F statistics have $(n_i - 1)$ and
$(N - n_i - n_j - n_t + 2)$ degrees of freedom. Nominal significance ($\alpha$)
levels for the reported Fs range from .34 to .99.

3. $n_j$ denotes the number of firms in the sample, by industry. The reported
F-statistics test homogeneity of average forecast accuracy across firms,
conditional on the analysts and years in the sample. The F-statistics have
$(n_j - 1)$ and $(N - n_i - n_j - n_t + 2)$ degrees of freedom. Nominal
significance ($\alpha$) levels for the reported Fs are smaller than .0001.

4. $n_t$ denotes the number of years in the sample, by industry. The reported
F-statistics test homogeneity of average forecast accuracy across years,
conditional on the analysts and firms in the sample. The F-statistics have
6 and $(N - n_i - n_j - n_t + 2)$ degrees of freedom. Nominal significance
($\alpha$) levels for the reported Fs are smaller than .0001 for all industries
except SIC 28, for which $\alpha = .0567$.

5. N denotes the number of sample observations by industry. Unadjusted $R^2$
is reported for each industry, along with the F statistic on the full model,
which has $(n_i + n_j + n_t - 3)$ and $(N - n_i - n_j - n_t + 2)$ degrees of
freedom. Nominal significance ($\alpha$) levels for the reported Fs are smaller
than .0001.


Table 4
Estimation of Absolute Forecast Error Model with Only Analyst Fixed Effects,
For Nine Industries:


$|e_{ijt}| = \mu_i + u_{ijt}$     (3)


Panel A: Summary of Regression Results

SIC     n_i       N        F    Prob > F     R^2

26       29     826     0.64       .92       .02
28      104    2900     4.65      <.01       .15
29       50    1775     2.16      <.01       .06
33       26     524     1.40       .10       .07
35       63    1106     1.22       .12       .07
36       41     641     1.77      <.01       .11
49       49    2888     3.75      <.01       .06
60       41    1594     0.74       .88       .02
63       19     492     1.80       .02       .06


Panel B: Comparison of Estimates, Models (2) and (3)

           Model (2): 3 Factor Model                Model (3): 1 Factor Model
SIC   mu_i max  mu_i min  Sbar_mu    S_eta     mu_i max  mu_i min  Sbar_mu      S_u

26      0.91      0.29      0.19      0.80       1.31      0.39      0.19      0.92
28      0.98      0.09      0.21      0.77       1.35      0.05      0.18      0.85
29      1.75      0.54      0.23      1.13       2.15      0.40      0.24      1.25
33      2.59      0.95      0.46      1.68       2.76      0.90      0.42      1.80
35      1.65     -0.09      0.60      1.76       2.55      0.32      0.45      1.82
36      0.47     -0.01      0.09      0.27       0.58      0.09      0.09      0.33
49      0.39      0.16      0.04      0.20       0.55      0.16      0.04      0.22
60      0.68      0.19      0.19      0.89       0.86      0.12      0.18      0.94
63      0.89      0.49      0.11      0.49       0.97      0.15      0.13      0.58

1.  For  equation  (3),  the  average  absolute  forecast  error  for  each  analyst  is 
computed  via  least  squares,  ignoring  firm  and  year  effects. 

2. The reported regression results are for nine 2-digit SIC industries, each
estimated separately. $n_i$ denotes the number of analysts in the sample by
industry. N denotes the number of forecast observations by industry. The F
statistic, with $(n_i - 1)$ and $(N - n_i)$ degrees of freedom, tests whether
analysts are homogeneous in forecast accuracy in Model (3).

3. Equation (2) estimates average absolute forecast error for each analyst,
conditional on the firm and year effects in the sample. Equation (3)
ignores the firm and year effects. The estimates reported for each model
are: the maximum and minimum average absolute forecast error for an analyst
($\mu_i$ max and $\mu_i$ min); the average, over analysts, of the standard errors on
average forecast errors ($\bar{S}_\mu$); and the residual standard error
($S_\eta$ and $S_u$).


Table 5

Analysis of Residuals From the Three Factor Fixed Effects Model:
Estimated Separately For Nine Industries


$|e_{ijt}| = \mu_i + \delta_j + \gamma_t + \eta_{ijt}$     (2)


        Skewness       Kurtosis       Test for      Probability of
SIC     Coefficient    Coefficient    Normality     Type I error

26         3.22           16.82         0.16            <.01
28         5.11           52.91         0.22            <.01
29         3.28           14.63         0.18            <.01
33         3.41           19.25         0.14            <.01
35         6.23           55.45         0.21            <.01
36         2.41           15.03         0.12            <.01
49         1.13            4.48         0.07            <.01
60        21.56          673.93         0.23            <.01
63         1.32            6.19         0.07            <.01

1.  The  fixed  effects  model  is  equation  (2)  in  the  text.  The  estimation 
results  are  described  in  Table  3. 

2. The skewness coefficient is computed:
$[n / ((n-1)(n-2))] \sum \eta^3 / S_\eta^3$.
For normally distributed data, the value of this statistic will be close to
zero. Larger values indicate right-skewness.

3. The kurtosis coefficient is computed:
$[n(n+1) / ((n-1)(n-2)(n-3))] \sum \eta^4 / S_\eta^4 - 3(n-1)^2 / ((n-2)(n-3))$.
For normally-distributed data, the value of this statistic will be close to
zero. Larger values indicate leptokurtosis.

4.  The  test  for  normality  is  the  Kolmogorov  D  statistic  to  test  whether  the 
data  are  distinguishable  from  a  normal  distribution  with  the  same  mean  and 
variance  as  the  sample.   Normality  is  rejected  in  all  cases. 
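
The skewness and kurtosis coefficients in notes 2 and 3 can be sketched in code. This is a minimal illustration, assuming that the residuals are centered on their sample mean (which is zero for least-squares residuals when the model includes a constant) and that the scale is the sample standard deviation with an n - 1 divisor. The function names and the small data sets are hypothetical.

```python
import math

def _standardized(x):
    """Return values centered on the sample mean and scaled by the sample s.d."""
    n = len(x)
    mean = sum(x) / n
    s = math.sqrt(sum((v - mean) ** 2 for v in x) / (n - 1))
    return [(v - mean) / s for v in x]

def skewness_coefficient(x):
    """[n / ((n-1)(n-2))] * sum of cubed standardized values."""
    n = len(x)
    z = _standardized(x)
    return n / ((n - 1) * (n - 2)) * sum(v ** 3 for v in z)

def kurtosis_coefficient(x):
    """[n(n+1) / ((n-1)(n-2)(n-3))] * sum z^4  -  3(n-1)^2 / ((n-2)(n-3))."""
    n = len(x)
    z = _standardized(x)
    lead = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    return lead * sum(v ** 4 for v in z) - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))

symmetric = [-2.0, -1.0, 0.0, 1.0, 2.0]       # invented, symmetric about zero
right_skewed = [1.0, 1.0, 1.0, 1.0, 10.0]     # invented, one large value

print(skewness_coefficient(symmetric), kurtosis_coefficient(symmetric))
print(skewness_coefficient(right_skewed))
```

Both statistics are close to zero for normal data, so the large positive values in the table point to right-skewed, heavy-tailed residuals.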


Table 6

Comparison of the Observed Distribution of Average Ranks with the Expected
Distribution if All Analysts are Alike, with Equal Probabilities of Each
Rank in Each Year:
The Paper Industry (SIC 26) as an Example


analyst       # of     average      Cumulative Density
id code       years    rank        Observed    Expected    |Difference|

42 8415         3      1.00          0.03        0.01          0.03
38 90000        5      1.40          0.07        0.03          0.04
9 4790          3      1.67          0.10        0.09          0.02
45 18700        4      1.75          0.13        0.12          0.02
12 41988        3      2.00          0.17        0.26          0.08
16 37978        7      2.00          0.21        0.26          0.05
36 1493         5      2.00          0.24        0.26          0.02
53 4500         4      2.00          0.28        0.26          0.02
37 44985        4      2.25          0.31        0.35          0.04
48 1200         4      2.25          0.34        0.35          0.00
65 1800         4      2.25          0.38        0.35          0.03
20 44636        7      2.29          0.41        0.36          0.05
12 31650        3      2.33          0.45        0.43          0.02
26 35250        3      2.33          0.48        0.43          0.05
13 24109        5      2.40          0.52        0.45          0.06
17 6361         4      2.50          0.55        0.53          0.02
23 23523        7      2.57          0.59        0.55          0.04
19 41566        5      2.60          0.62        0.57          0.05
1 42804         3      2.67          0.66        0.64          0.01
86 250          3      2.67          0.69        0.64          0.05
44 14497        4      2.75          0.72        0.71          0.02
10 448          5      2.80          0.76        0.73          0.03
26 35068        4      3.00          0.79        0.87          0.07
67 3600         4      3.00          0.83        0.87          0.04
4 9031          3      3.33          0.86        0.95          0.08
46 2511         6      3.50          0.90        0.97          0.07
9 4382          4      3.50          0.93        0.97          0.04
5 27755         3      3.67          0.97        0.99          0.02
81 4000         3      4.00          1.00        1.00          0.00

Maximum Absolute Difference:        0.08
Kolmogorov-Smirnov Statistic:       0.37
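
The step from the maximum absolute difference to the test statistic can be sketched as follows. This is a minimal sketch that assumes the two-sample scaling sqrt(mn/(m+n)), which is suggested by the D_mn notation in Table 9 but is not stated in this table, with m = 29 analysts and n = 55 sample points. The function name is hypothetical. The unrounded difference is taken from the fifth row of the table, where the observed cumulative density is 5/29 and the expected value is 0.257 (from Table 8).

```python
import math

def ks_two_sample_statistic(d_max, m, n):
    """Scale a maximum CDF difference by sqrt(m*n/(m+n)) (assumed two-sample form)."""
    return math.sqrt(m * n / (m + n)) * d_max

m_analysts = 29                  # analysts in the Paper industry sample
n_points = 55                    # sample points M of the expected distribution
d_max = abs(5 / 29 - 0.257)      # fifth row: observed 5/29, expected 0.257

ks_paper = ks_two_sample_statistic(d_max, m_analysts, n_points)
print(f"D = {d_max:.3f}, KS = {ks_paper:.2f}")
```

Under this assumed scaling the sketch reproduces the reported statistic to two decimals, whereas starting from the rounded difference of 0.08 would give a slightly smaller value.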


Table 7

Computation of the Expected Frequency of Average Ranks if All Analysts are
Alike, with Equal Probabilities of Each Rank in Each Year:
The Paper Industry (SIC 26) as an Example


          T = 3, N_3 = 10                          T = 4, N_4 = 10

average   probability   expected         average   probability   expected
rank      under null    frequency        rank      under null    frequency

1.00        0.016         0.16           1.00        0.004         0.04
1.33        0.047         0.47           1.25        0.016         0.16
1.67        0.094         0.94           1.50        0.039         0.39
2.00        0.156         1.56           1.75        0.078         0.78
2.33        0.188         1.88           2.00        0.121         1.21
2.67        0.188         1.88           2.25        0.156         1.56
3.00        0.156         1.56           2.50        0.172         1.72
3.33        0.094         0.94           2.75        0.156         1.56
3.67        0.047         0.47           3.00        0.121         1.21
4.00        0.016         0.16           3.25        0.078         0.78
                                         3.50        0.039         0.39
                                         3.75        0.016         0.16
                                         4.00        0.004         0.04

1. This table demonstrates the computation of the expected frequency of
analysts' average ranks, using the Paper industry as an example.
Computations are shown for the cases of analysts with T=3 and T=4 years.
The number of analysts in the Paper industry sample with forecasts in 3
years ($N_3$) is 10. The number with forecasts in 4 years ($N_4$) is also 10.
Similar computations (not shown) are carried out for T=5, 6 and 7, for which
the numbers of analysts are $N_5 = 5$, $N_6 = 1$, $N_7 = 3$.

2. Analysts are ranked in quartiles in each year. The number of different
average ranks, over T years, is 3T + 1.

3. The number of distinct sets of quartile ranks of length T is $4^T$. The
probability of an average rank, under the null hypothesis that each quartile
is equally likely in each of the T years, is the proportion of the $4^T$ sets
of ranks with that average.

4. The expected frequency in a sample of size $N_T$ is the probability times
$N_T$.
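
The computation described in these notes can be sketched by direct enumeration. This is a minimal illustration, assuming four equally likely quartile ranks that are independent across years; the function name is hypothetical, and exact fractions are used so that equal averages are grouped without rounding error.

```python
from fractions import Fraction
from itertools import product

def average_rank_distribution(T, n_ranks=4):
    """Null probability of each average rank over T years.

    Enumerates all n_ranks**T rank sequences and returns a dict mapping
    each attainable average rank to the share of sequences producing it.
    """
    counts = {}
    for seq in product(range(1, n_ranks + 1), repeat=T):
        avg = Fraction(sum(seq), T)
        counts[avg] = counts.get(avg, 0) + 1
    total = n_ranks ** T
    return {avg: Fraction(c, total) for avg, c in sorted(counts.items())}

dist3 = average_rank_distribution(3)
dist4 = average_rank_distribution(4)

n3 = 10   # analysts with forecasts in 3 years (note 1)
for avg, p in dist3.items():
    # average rank, probability under the null, expected frequency
    print(f"{float(avg):.2f}  {float(p):.3f}  {n3 * float(p):.2f}")
```

Enumeration is practical here because T is at most 7, so there are never more than 4^7 = 16,384 sequences to visit.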


Table 8

Computation of the Density and Cumulative Density for the Expected Frequency
of Average Ranks:
The Paper Industry (SIC 26) as an Example


avg.    exp.               cum.          avg.    exp.               cum.
rank    freq.   density    density       rank    freq.   density    density

1.00    0.201    0.007      0.007        2.50    1.860    0.064      0.532
1.14    0.001    0.000      0.007        2.57    0.390    0.013      0.546
1.17    0.001    0.000      0.007        2.60    0.757    0.026      0.572
1.20    0.024    0.001      0.008        2.67    2.008    0.069      0.641
1.25    0.156    0.005      0.013        2.71    0.351    0.012      0.653
1.29    0.005    0.000      0.013        2.75    1.563    0.054      0.707
1.33    0.474    0.016      0.030        2.80    0.659    0.023      0.730
1.40    0.073    0.003      0.032        2.83    0.111    0.004      0.733
1.43    0.015    0.001      0.033        2.86    0.285    0.010      0.743
1.50    0.404    0.014      0.047        3.00    3.555    0.123      0.866
1.57    0.037    0.001      0.048        3.14    0.133    0.005      0.870
1.60    0.171    0.006      0.054        3.17    0.053    0.002      0.872
1.67    0.967    0.033      0.087        3.20    0.317    0.011      0.883
1.71    0.076    0.003      0.090        3.25    0.781    0.027      0.910
1.75    0.781    0.027      0.117        3.29    0.076    0.003      0.913
1.80    0.317    0.011      0.128        3.33    0.967    0.033      0.946
1.83    0.053    0.002      0.130        3.40    0.171    0.006      0.952
1.86    0.133    0.005      0.134        3.43    0.037    0.001      0.953
2.00    3.555    0.123      0.257        3.50    0.404    0.014      0.967
2.14    0.285    0.010      0.267        3.57    0.015    0.001      0.968
2.17    0.111    0.004      0.270        3.60    0.073    0.003      0.970
2.20    0.659    0.023      0.293        3.67    0.474    0.016      0.987
2.25    1.563    0.054      0.347        3.71    0.005    0.000      0.987
2.29    0.351    0.012      0.359        3.75    0.156    0.005      0.992
2.33    2.008    0.069      0.428        3.80    0.024    0.001      0.993
2.40    0.757    0.026      0.454        3.83    0.001    0.000      0.993
2.43    0.390    0.013      0.468        3.86    0.001    0.000      0.993
                                         4.00    0.201    0.007      1.000

1.  The  computation  of  expected  frequency  is  illustrated  in  Table  7.   The  distributions 
are  expected  distributions  of  average  quartile  rankings  through  time,  conditional  on 
the  number  of  analysts  in  the  Paper  industry  sample. 


2.  Analysts  are  ranked  in  quartiles  each  year.   Average  ranks  are  computed  through 
time,  for  T=3,...,7  years. 

3.  The  computation  of  expected  frequencies  for  T=3  and  T=4  years  is  shown  in  Table  7. 
The  expected  frequencies  listed  here  are  aggregated  for  T=3,...,7  years. 

4.  The  density  is  the  expected  frequency  divided  by  29,  the  number  of  analysts  in  the 
Paper  Industry  sample. 

5.  "Cum.  density"  is  the  cumulative  expected  density. 
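
The aggregation described in these notes can be sketched as follows. This is a minimal illustration, assuming the analyst counts by number of years given in note 1 of Table 7 (10, 10, 5, 1 and 3 analysts for T = 3 to 7) and four equally likely quartile ranks in each year. The function and variable names are hypothetical.

```python
from fractions import Fraction

def rank_sum_counts(T, n_ranks=4):
    """Count rank sequences of length T by their total, via repeated convolution."""
    counts = {0: 1}
    for _ in range(T):
        nxt = {}
        for total, c in counts.items():
            for r in range(1, n_ranks + 1):
                nxt[total + r] = nxt.get(total + r, 0) + c
        counts = nxt
    return counts

# Paper-industry analysts with forecasts in T years (Table 7, note 1).
analysts_by_years = {3: 10, 4: 10, 5: 5, 6: 1, 7: 3}

# Expected frequency of each average rank, summed over T = 3,...,7.
expected_freq = {}
for T, n_T in analysts_by_years.items():
    for total, c in rank_sum_counts(T).items():
        avg = Fraction(total, T)
        expected_freq[avg] = (expected_freq.get(avg, Fraction(0))
                              + Fraction(n_T * c, 4 ** T))

n_analysts = sum(analysts_by_years.values())

# Density is frequency over the number of analysts; cumulate in rank order.
rows = []
cum = Fraction(0)
for avg in sorted(expected_freq):
    density = expected_freq[avg] / n_analysts
    cum += density
    rows.append((avg, expected_freq[avg], density, cum))

for avg, freq, density, cum_density in rows:
    print(f"{float(avg):.2f}  {float(freq):.3f}  "
          f"{float(density):.3f}  {float(cum_density):.3f}")
```

Keeping the average ranks as exact fractions merges values such as 6/3 and 8/4 into a single sample point, which is how the separate distributions for each T collapse to the 55 distinct points in the table.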


Table 9

Kolmogorov-Smirnov Test Statistics for Nine Industries, Testing the Null Hypothesis
that the Distribution of Average Quartile Rankings Is Indistinguishable from the
Expected Distribution if All Analysts are Alike, and Each Quartile is Equally Likely.


Panel A: Kolmogorov-Smirnov Test

SIC    # analysts    D_mn      KS

26         29        0.08     0.37
28        104        0.24     1.42
29         50        0.12     0.63
33         26        0.13     0.56
35         63        0.15     0.84
36         41        0.16     0.79
49         49        0.20     1.03
60         41        0.18     0.89
63         19        0.12     0.47


Panel B:

Critical Points and Cumulative Probabilities for the
Limiting Distribution of KS

  c     P(KS <= c)         c     P(KS <= c)         c     P(KS <= c)

0.30      0.0000         1.00      0.7300         1.70      0.9938
0.40      0.0028         1.10      0.8223         1.80      0.9969
0.50      0.0361         1.20      0.8878         1.90      0.9985
0.60      0.1357         1.30      0.9319         2.00      0.9993
0.70      0.2888         1.40      0.9603         2.10      0.9997
0.80      0.4559         1.50      0.9778         2.20      0.9999
0.90      0.6073         1.60      0.9880

1. $D_{mn}$ denotes the absolute value of the maximum difference between the observed and
expected cumulative densities. KS denotes the Kolmogorov-Smirnov test statistic.

2. Values in this table are taken from DeGroot (1986), p. 555.